Optimizing energy consumption for robot navigation in fields requires energy-cost maps. However, obtaining such maps remains challenging, especially for large, uneven terrains. Physics-based energy models work for uniform, flat surfaces but do not generalize well to such terrains. Furthermore, slopes make the energy consumption at every location direction-dependent, which complicates both data collection and energy prediction. In this paper, we address these challenges in a data-driven manner. We consider a function that takes terrain geometry and robot motion direction as input and outputs expected energy consumption. The function is represented as a ResNet-based neural network whose parameters are learned from field-collected data. The predictions of our method are within 12% of the ground truth in test environments that are unseen during training. We compare our method to a baseline from the literature that uses a basic physics-based model, and show that our method outperforms it by more than 10% in prediction error. More importantly, our method generalizes better when applied to test data from new environments with various slope angles and navigation directions.
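To illustrate why slopes make the energy cost direction-dependent, the following is a minimal sketch of a simple physics-based model of the kind a baseline might use. The rolling-resistance-plus-gravity formula, the function name `segment_energy`, and every parameter value are illustrative assumptions, not the model used in the paper.

```python
import math

def segment_energy(mass_kg, distance_m, slope_deg, heading_deg, mu=0.1, g=9.81):
    """Hypothetical physics-based energy (J) to drive distance_m on an inclined plane.

    slope_deg   : inclination of the plane along its steepest-ascent direction
    heading_deg : angle between the motion direction and the uphill direction
    mu          : assumed rolling-resistance coefficient
    """
    slope = math.radians(slope_deg)
    heading = math.radians(heading_deg)
    # Height gained per metre travelled; this is the term that depends on direction.
    grade = math.sin(slope) * math.cos(heading)
    # Rolling resistance depends on the normal force, which is the same for any heading.
    force = mass_kg * g * (mu * math.cos(slope) + grade)
    # Assumption: no energy recovery, so a downhill segment costs at least zero.
    return max(force, 0.0) * distance_m

# Same location and slope, three different headings (uphill, across, downhill).
for heading in (0, 90, 180):
    print(heading, round(segment_energy(50.0, 10.0, 10.0, heading), 1))
```

Under these assumptions the cost on flat ground is the same for every heading, while on a slope it varies with direction, which is why a single scalar cost per location is not enough.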
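Correspondence search is an essential step in rigid point cloud registration algorithms. Most methods maintain a single correspondence at each step and gradually remove wrong correspondences. However, establishing one-to-one correspondences is very difficult, especially when matching two point clouds with many local features. This paper proposes an optimization method that retains all possible correspondences for each keypoint when matching a partial point cloud to a complete point cloud. These uncertain correspondences are then gradually updated with the estimated rigid transformation by considering the matching cost. In addition, we propose a new point feature descriptor that measures the similarity between local point cloud regions. Extensive experiments show that our method outperforms state-of-the-art (SOTA) methods even when matching different objects within the same category. Notably, our method outperforms SOTA methods when registering real-world noisy depth images to a template shape.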
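One of the challenging input settings for visual servoing is when the initial and goal camera views are far apart. Such a setting is difficult because a wide baseline causes drastic changes in object appearance and introduces occlusions. This paper presents a novel self-supervised visual servoing method for wide-baseline images that does not require 3D ground-truth supervision. Existing approaches that regress the absolute camera pose with respect to an object require 3D ground-truth data of the object in the form of 3D bounding boxes or meshes. We learn a coherent visual representation by leveraging a geometric property called 3D equivariance: the representation transforms in a predictable way as a function of the 3D transformation. To ensure that the feature space is faithful to the underlying geodesic space, a geodesic-preserving constraint is applied in conjunction with the equivariance. We design a Siamese network that can effectively enforce these two geometric properties without 3D supervision. With the learned model, the relative transformation can be inferred simply by following the gradient in the learned space, and it is used as feedback for closed-loop visual servoing. Our method is evaluated on objects from the YCB dataset and shows meaningful outperformance on a visual servoing task, or object alignment task, relative to state-of-the-art approaches that use 3D supervision. Our method reduces the average distance error by more than 35% and achieves a success rate above 90% within the error tolerance.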
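Estimating accurate and reliable fruit and vegetable counts from images in real-world settings, such as orchards, is a challenging problem that has recently received significant attention. Estimating fruit counts before harvest provides useful information for logistics planning. Although considerable progress has been made in fruit detection, estimating the actual counts remains challenging. In practice, fruits are often clustered together. Therefore, methods that only detect fruits fail to offer a general solution for estimating accurate fruit counts. Furthermore, in horticultural studies, finer-grained information such as the distribution of the number of apples per cluster is desirable, rather than a single yield estimate. In this work, we formulate fruit counting from images as a multi-class classification problem and solve it by training a convolutional neural network. We first evaluate the per-image accuracy of our method and compare it with a state-of-the-art method based on Gaussian mixture models over four test datasets. Even though the parameters of the Gaussian-mixture-model-based method are specifically tuned for each dataset, our network outperforms it on three of the four datasets, with a maximum accuracy of 94%. Next, we use the method to estimate the yield for two datasets for which we have ground truth. Our method achieves 96-97% accuracy. For more details, please see our video here: https://www.youtube.com/watch?v=le0mb5p-syc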
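Treating counting as multi-class classification means each image patch is assigned a class whose index is the number of fruits it contains. The following minimal sketch shows how such per-patch predictions could be aggregated into a yield estimate and a per-cluster distribution. The score vectors are made-up stand-ins for network outputs, and the names `predict_count` and `summarize` are hypothetical.

```python
from collections import Counter

def predict_count(class_scores):
    """Return the index of the highest score; class k is read as 'k fruits'."""
    return max(range(len(class_scores)), key=lambda k: class_scores[k])

def summarize(patch_scores):
    """Aggregate per-patch predictions into a total count and a count distribution."""
    counts = [predict_count(scores) for scores in patch_scores]
    return sum(counts), Counter(counts)

# Hypothetical classifier outputs for three patches, classes 0..4 fruits.
scores = [
    [0.05, 0.10, 0.70, 0.10, 0.05],  # highest score at class 2
    [0.80, 0.10, 0.05, 0.03, 0.02],  # highest score at class 0
    [0.02, 0.08, 0.20, 0.60, 0.10],  # highest score at class 3
]
total, distribution = summarize(scores)
print(total, dict(distribution))
```

The distribution returned alongside the total is the kind of finer-grained information (fruits per cluster) that a single yield number does not provide.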
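We present a general framework for accurately positioning sensors and end effectors in farm settings using a camera mounted on a robotic manipulator. Our main contribution is a visual servoing approach based on a new and robust feature tracking algorithm. Results from field experiments conducted in an apple orchard demonstrate that our approach converges to a given termination criterion even under environmental influences such as strong winds, varying illumination conditions, and partial occlusion of the target object. Furthermore, we show experimentally that the system converges to the desired view for a wide range of initial conditions. This approach opens up possibilities for new applications such as automated fruit inspection, fruit picking, or precision pesticide application.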
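Reconstructing the 3D pose of a person in metric scale from a single-view image is a geometrically ill-posed problem. For example, we cannot measure the exact distance of a person to the camera from a single-view image without additional scene assumptions (e.g., known height). Existing learning-based approaches circumvent this issue by reconstructing the 3D pose up to scale. However, many applications, such as virtual telepresence, robotics, and augmented reality, require metric-scale reconstruction. In this paper, we show that audio signals recorded along with an image provide complementary information for reconstructing the metric 3D pose of a person. The key insight is that as audio signals traverse the 3D space, their interactions with the body provide metric information about the body's pose. Based on this insight, we introduce a time-invariant transfer function called the pose kernel, the impulse response of audio signals induced by the body pose. The main properties of the pose kernel are that (1) its envelope is highly correlated with the 3D pose, (2) its time response corresponds to the arrival time, indicating the metric distance to the microphone, and (3) it is invariant to the scene geometry configuration. As a result, it readily generalizes to unseen scenes. We design a multi-stage 3D CNN that fuses audio and visual signals and learns to reconstruct the 3D pose in metric scale. We show that our multi-modal method produces accurate metric reconstruction in real-world scenes, in contrast to state-of-the-art lifting approaches, including parametric mesh regression and depth regression.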
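Property (2) can be made concrete with a small sketch: the arrival time of a peak in an impulse response, multiplied by the speed of sound, gives the length of the path the sound travelled. The code below assumes the input is an already isolated body-induced response, picks its strongest peak, and uses an assumed speed of sound of 343 m/s; the function name and the toy signal are hypothetical and this is not the paper's pipeline.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly room temperature

def path_length_from_response(impulse_response, sample_rate_hz):
    """Path length (m) implied by the strongest peak of an impulse response."""
    # Index of the sample with the largest magnitude.
    peak_index = max(range(len(impulse_response)),
                     key=lambda i: abs(impulse_response[i]))
    # Convert the sample index to an arrival time in seconds.
    arrival_time_s = peak_index / sample_rate_hz
    return SPEED_OF_SOUND * arrival_time_s

# Toy response: silence with a weak early peak and a strong peak at sample 140.
response = [0.0] * 200
response[60] = 0.1
response[140] = -0.8
print(path_length_from_response(response, 48000))
```

The returned value is the total travelled path (for a reflection, source to body plus body to microphone), so turning it into a body position would need the known source and microphone locations.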
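Land used for grazing cattle takes up one-third of the land in the United States. These areas can be highly rugged. Yet they need to be maintained to prevent weeds from taking over the nutritious grassland. This can be a daunting task, especially in organic farming, since herbicides cannot be used. In this paper, we present the design of Cowbot, an autonomous weed mower for pastures. Cowbot is an electric mower designed to operate in the rugged environments of cow pastures and to provide a cost-effective method of weed control for organic farms. Path planning for the Cowbot is challenging because the weed distribution on pastures is unknown. Given a limited field of view, online path planning is necessary to detect weeds and plan paths to mow them. We study the general online path planning problem for an autonomous mower with curvature and field-of-view constraints. We develop two online path planning algorithms that are able to use new information about weeds to optimize path length and ensure coverage. We deploy our algorithms on the Cowbot and perform field experiments to validate the suitability of our methods for real-time path planning. We also perform extensive simulation experiments, which show that our algorithms reduce path length by up to 60% compared to baseline boustrophedon and random-search-based coverage paths.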
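For reference, the boustrophedon baseline mentioned above is a back-and-forth sweep that covers the whole field regardless of where the weeds are. The following is a minimal sketch for a rectangular field. The function names, the rectangular field, and the swath width are illustrative assumptions, and the sketch ignores the curvature and field-of-view constraints studied in the paper.

```python
import math

def boustrophedon_path(width, height, swath):
    """Waypoints of a back-and-forth sweep over a width x height rectangle.

    swath is the assumed mowing width; passes run parallel to the x-axis.
    """
    if swath <= 0:
        raise ValueError("swath must be positive")
    waypoints = []
    y = swath / 2.0           # centre line of the first pass
    left_to_right = True
    # Keep adding passes while the lower edge of the pass is inside the field.
    while y - swath / 2.0 < height:
        start_x, end_x = (0.0, width) if left_to_right else (width, 0.0)
        waypoints.append((start_x, y))
        waypoints.append((end_x, y))
        left_to_right = not left_to_right
        y += swath
    return waypoints

def path_length(waypoints):
    """Total Euclidean length of the polyline through the waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))

path = boustrophedon_path(10.0, 4.0, 2.0)
print(path, path_length(path))
```

Because this sweep visits every strip of the field, its length is independent of the weed distribution, which is what an online planner that reacts to detected weeds can improve on.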
Being able to grasp objects is a fundamental component of most robotic manipulation systems. In this paper, we present a new approach to simultaneously reconstruct a mesh and a dense grasp quality map of an object from a depth image. At the core of our approach is a novel camera-centric object representation called the "object shell" which is composed of an observed "entry image" and a predicted "exit image". We present an image-to-image residual ConvNet architecture in which the object shell and a grasp-quality map are predicted as separate output channels. The main advantage of the shell representation and the corresponding neural network architecture, ShellGrasp-Net, is that the input-output pixel correspondences in the shell representation are explicitly represented in the architecture. We show that this coupling yields superior generalization capabilities for object reconstruction and accurate grasp quality estimation implicitly considering the object geometry. Our approach yields an efficient dense grasp quality map and an object geometry estimate in a single forward pass. Both of these outputs can be used in a wide range of robotic manipulation applications. With rigorous experimental validation, both in simulation and on a real setup, we show that our shell-based method can be used to generate precise grasps and the associated grasp quality with over 90% accuracy. Diverse grasps computed on shell reconstructions allow the robot to select and execute grasps in cluttered scenes with more than 93% success rate.
Training a machine learning model requires sufficient data. Sufficiency is not always about quantity, but about relevance and reduced redundancy. Data-generating processes create massive amounts of data, and using such big data raw consumes substantial computational resources. A proper condensed representation can be used instead of the raw data. By combining K-means, a well-known clustering method, with correction and refinement mechanisms, a novel condensed representation method for machine learning applications is introduced. To present the method meaningfully and visually, synthetically generated data is employed. The results show that acceptably accurate model training is possible when the condensed representation is used instead of the raw data.
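The basic idea of condensing data with K-means can be sketched as follows: the raw points are replaced by a few centroids, each carrying a weight equal to the number of points it stands for. This is plain K-means only, written without external libraries. The correction and refinement steps of the proposed method are not described in the abstract and are not modelled here, and the function names are hypothetical.

```python
import math
import random

def assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans_condense(points, k, iterations=20, seed=0):
    """Condense points into k (centroid, weight) pairs using plain K-means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # initialise from k distinct data points
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its members; keep it if it has none.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    # Final assignment so the weights match the returned centroids.
    weights = [len(members) for members in assign(points, centroids)]
    return list(zip(centroids, weights))

# Toy data: two well-separated groups of three points each.
raw = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
condensed = sorted(kmeans_condense(raw, k=2))
print(condensed)
```

A downstream model could then be trained on the weighted centroids instead of the raw points, for example by using the weights as sample weights.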
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions. Our algorithm achieves $O(\sigma / \sqrt{T})$ convergence when the oracle feedback is stochastic with variance $\sigma^2$, and improves its convergence to $O(1 / T^3)$ with deterministic oracles, where $T$ is the number of iterations. Our method also interpolates between these rates without knowing the nature of the oracle a priori. This is enabled by a parameter-free adaptive step size that requires no knowledge of the smoothness modulus, the variance bounds, or the diameter of the constrained set. To our knowledge, this is the first universal algorithm with such global guarantees in the second-order optimization literature.